Strategy recovery for stochastic mean payoff games

Author

  • Marcello Mamino
Abstract

We prove that finding optimal positional strategies for stochastic mean payoff games when the value of every state of the game is known is, in general, as hard as solving such games tout court. This answers a question posed by Daniel Andersson and Peter Bro Miltersen.

In this note, we consider perfect information 0-sum stochastic games, which, for short, we will just call stochastic games. For us, a stochastic game is a finite directed graph whose vertices we call states and whose edges we call transitions; multiple edges and loops are allowed, but no state can be a sink. To each state $s$ is associated an owner $o(s)$, which is one of the two players Max and Min. Each transition $s \xrightarrow{A,p} t$ has an action $A$ and a probability $p \in \mathbb{Q} \cap [0,1]$, with the condition that, for each state $s$, the probabilities of the transitions exiting $s$ associated to the same action must sum to 1. We say that the action $A$ is available at state $s$ if one of the transitions exiting $s$ is associated to $A$. Furthermore, to each action $A$ is associated a reward $r(A) \in \mathbb{Q}$.

A play of a stochastic game $G$ begins in some state $s_0$ and produces an unending sequence of states $\{s_i\}_{i \in \mathbb{N}}$ and actions $\{A_i\}_{i \in \mathbb{N}}$. At move $i$, the owner of the current state $s_i$ chooses an action $A_i$ among those available at $s_i$, then one of the transitions exiting $s_i$ with action $A_i$ is selected at random according to their respective probabilities, and the next state $s_{i+1}$ is the destination of the chosen transition. A play can be evaluated according to the $\beta$-discounted payoff criterion $v_\beta(A_0, A_1, \ldots) = (1-\beta) \sum^{\infty}$ ...
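The definition above can be made concrete with a small sketch. The representation below is an assumption made for illustration, not something taken from the paper: a transition is a tuple `(source, action, probability, target)`, and the names `check_game`, `available_actions`, and `step` are hypothetical. The sketch checks the stated well-formedness conditions (owners are Max or Min, probabilities lie in [0, 1] and sum to 1 per state and action, no state is a sink) and samples one move of a play.

```python
from collections import defaultdict
from fractions import Fraction
import random

# Hypothetical encoding: a transition is (source, action, probability, target).

def check_game(owners, transitions, rewards):
    """Return a list of violated conditions; an empty list means well-formed."""
    problems = []
    totals = defaultdict(Fraction)  # (state, action) -> summed probability
    for s, a, p, t in transitions:
        if s not in owners or t not in owners:
            problems.append(f"transition {s}->{t} uses an unknown state")
        if not (0 <= p <= 1):
            problems.append(f"probability {p} is outside [0, 1]")
        if a not in rewards:
            problems.append(f"action {a} has no reward")
        totals[(s, a)] += p
    for (s, a), total in totals.items():
        if total != 1:
            problems.append(f"probabilities for action {a} at {s} sum to {total}")
    for s, owner in owners.items():
        if owner not in ("Max", "Min"):
            problems.append(f"state {s} has invalid owner {owner}")
        if not any(src == s for src, _, _, _ in transitions):
            problems.append(f"state {s} is a sink")
    return problems

def available_actions(transitions, s):
    """Actions associated to at least one transition exiting s."""
    return sorted({a for src, a, _, _ in transitions if src == s})

def step(transitions, s, action, rng):
    """Sample the next state once the owner of s has chosen `action`."""
    options = [(t, p) for src, a, p, t in transitions if src == s and a == action]
    targets = [t for t, _ in options]
    weights = [float(p) for _, p in options]
    return rng.choices(targets, weights=weights, k=1)[0]

# A toy two-state game, invented for this example.
owners = {"s": "Max", "t": "Min"}
rewards = {"A": Fraction(1), "B": Fraction(0), "C": Fraction(-1)}
transitions = [
    ("s", "A", Fraction(1, 2), "s"),
    ("s", "A", Fraction(1, 2), "t"),
    ("s", "B", Fraction(1), "t"),
    ("t", "C", Fraction(1), "s"),
]
```

Exact rationals (`Fraction`) are used so that the "sum to 1" condition can be tested with equality rather than a floating-point tolerance, which matches the requirement that probabilities are rational.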

Similar resources

The Complexity of Solving Stochastic Games on Graphs

We consider some well-known families of two-player zero-sum perfect-information stochastic games played on finite directed graphs. Generalizing and unifying results of Liggett and Lippman, Zwick and Paterson, and Chatterjee and Henzinger, we show that the following tasks are polynomial-time (Turing) equivalent. – Solving stochastic parity games, – Solving simple stochastic games, – Solving stoc...

Games through Nested Fixpoints

In this paper we consider two-player zero-sum payoff games on finite graphs, both in the deterministic as well as in the stochastic setting. In the deterministic setting, we consider total-payoff games which have been introduced as a refinement of mean-payoff games [18, 10]. In the stochastic setting, our class is a turn-based variant of liminf-payoff games [15, 16, 4]. In both settings, we pro...

Reduction of stochastic parity to stochastic mean-payoff games

A stochastic graph game is played by two players on a game graph with probabilistic transitions. We consider stochastic graph games with ω-regular winning conditions specified as parity objectives, and mean-payoff (or limit-average) objectives. These games lie in NP ∩ coNP. We present a polynomial-time Turing reduction of stochastic parity games to stochastic mean-payoff games.

On the computational complexity of solving stochastic mean-payoff games

We consider some well known families of two-player, zero-sum, turn-based, perfect information games that can be viewed as special cases of Shapley’s stochastic games. We show that the following tasks are polynomial time equivalent: • Solving simple stochastic games, • solving stochastic mean-payoff games with rewards and probabilities given in unary, and • solving stochastic mean-payoff games ...

PRISM-Games 2.0: A Tool for Multi-objective Strategy Synthesis for Stochastic Games

We present a new release of PRISM-games, a tool for verification and strategy synthesis for stochastic games. PRISM-games 2.0 significantly extends its functionality by supporting, for the first time: (i) long-run average (mean-payoff) and ratio reward objectives, e.g., to express energy consumption per time unit; (ii) strategy synthesis and Pareto set computation for multi-objective properties...

Journal:
  • Theor. Comput. Sci.

Volume 675

Publication date 2017